(57) Abstract

Oligonucleotide sequences are provided coding for
T-cell-specific antigen receptors or fragments thereof.
The oligonucleotide sequences can be used as probes for
detecting helper and cytotoxic T-cells, preparing and
isolating DNA sequences encoding for the receptor
polypeptide, and in constructions for expression of
receptor polypeptides or fragments thereof. In addition,
processing signals from the receptor subunits can be
employed in conjunction with modified wild type
oligonucleotide sequences or non-wild type
oligonucleotide sequences.

T-CELL RECEPTOR-SPECIFIC FOR ANTIGEN POLYPEPTIDES
AND RELATED POLYNUCLEOTIDES 

BACKGROUND OF THE INVENTION

Field of the Invention 

The hematopoietic system is extraordinarily
complex, which is not surprising in view of the central
role blood cells play in the maintenance and survival
of the host. One aspect of great importance is the
manner in which the host protects itself from various
pathogens. Two families of cells play a salient role
in protecting the host, B-cells and T-cells.

The mystery of how the B-cells are able to
produce an extraordinary variety of immunoglobulins has
been explained to a substantial degree. The germline
DNA is now known to undergo rearrangements, so as to
join various exons together to produce a variable
region which is then joined to differing constant
regions as the B-cell matures. The mechanism by which
the DNA undergoes the rearrangement and the subsequent
transcript is spliced to produce a messenger RNA coding
for a specific immunoglobulin has been an exciting
adventure demonstrating the potency of the tools
afforded by the developments in molecular biology.

Another class of cells important to the
immune system of the host is the T-cells. These cells
differ from the B-cells in that they do not secrete
immunoglobulins, although they appear to have a similar
range of antigenic specificities. Particularly, helper
T-cells, which are involved in stimulating B-cell
proliferation, can have specificity analogous to that
of B-cells, with the additional requirement that they
must also recognize self-major histocompatibility
determinants simultaneously.

The specificity of T-cells can find
application in a wide variety of situations. If one
could modify a helper T-cell by introducing a foreign
receptor site, one could change the response of the
host to a foreign antigen. Furthermore, in many
situations, it may be of interest to determine whether
a cell is a helper T-cell or other type of cell. In
addition, one has the opportunity to determine
monoclonality in the host, which can be useful in the
diagnosis of T-cell leukemias. Also, having DNA
sequences which encode for portions of the T-cell
antigen-specific receptor would allow for constructions
involving the combination of native T-cell sequences
with foreign sequences to produce novel proteins which
could act as receptors. Also, antisera and monoclonal
antibodies could be generated against specific parts of
the protein, using either synthetic peptides or
producing the protein in an expression vector. By
employing hybridization with DNA sequences, subsets of
T-cells may be determined as well as genetic
differences and defects.

Description of the Prior Art

The second domain of HLA-DC has been shown to
be homologous to immunoglobulin. Auffray et al., Proc.
Natl. Acad. Sci. USA (1982) 79:6337-6341. The sequence
about the intrachain disulfide bond in the
immunoglobulin variable region is discussed by Kabat et
al., in Sequences of Immunological Interest, U.S. Dept.
of Health and Human Services, Washington, D.C. (1983).
Cross-reactivity between B-cell anti-idiotypic antisera
and T-cells is reported by Eichmann and Rajewsky, Eur.
J. Immunol. (1975) 5:661-666; Binz and Wigzell, J. Exp.
Med. (1975) 142:197-211, and Augustin et al., in
Regulatory T Lymphocyte (eds. Pernis and Vogel)
171-184, Academic Press, N.Y., 1980. Lack of
nucleotide sequence similarity between T-cell specific
genes and immunoglobulin coding genes is reported by
Kronenberg et al., J. Exp. Med. (1980) 152:1745-1761
and Kronenberg et al., ibid. (1983) 158:210-227, among
others. Murine T-cell specific proteins are reported
by Kappler et al., Cell (1983) 34:727-737 and McIntyre
and Allison, ibid. (1983) 34:739-746. Allison et al.,
J. Immunol. (1982) 129:2293-2300; Haskins et al.,
J. Exp. Med. (1983) 157:1149-1169; Meuer et al., Nature
(1983) 303:808-810 and Samuelson et al., Proc. Natl.
Acad. Sci. USA (1983) 80:6972-6976 report the
immunoprecipitation from T-cells of a disulfide linked
heterodimer composed of two distinct glycoproteins of
37-50kd in size. McIntyre and Allison, supra (1983)
and Acuto et al., Cell (1983) 34:717-726 report that
the heterodimer appears to have variable and constant
portions by peptide map analyses. Heber-Katz et al.,
J. Exp. Med. (1982) 155:1086-1099 and Hedrick et al.,
Cell (1982) 30:141-152 report the production of
MHC-restricted T-helper hybridomas, which disclosure is
incorporated herein by reference. Davis et al., in B
and T Cell Tumors, UCLA Symposium Vol. 24 (eds. Vitteta
and Fox) 215-220, Academic Press, N.Y. 1982, report
that T and B lymphocytes differ by a very small
fraction of their gene expression.

Saito, Nature (1984) 309:757-762, reports a
T-cell-specific cDNA clone which is rearranged in
cytotoxic T-cell DNAs and has variable, constant and
joining region homologous elements. Siu et al., Cell
(1984) 32:393-401 and Kavaler et al., Nature (1984)
310:421-423, report the presence of diversity elements
in the β-chain. The α-chain of T-cell receptor
molecules has been reported to be as diverse as the
β-chain (Kappler et al., Cell (1983) 35:295-302).

SUMMARY OF THE INVENTION 

A technique is provided whereby rare
messenger RNA is isolated. By means of this technique,
DNA sequences encoding for antigen-specific receptors
in T-cells are obtained as well as other T-cell
specific gene products. The DNA can be used in a
variety of ways, such as nucleotide probes, combining
with foreign DNA sequences to produce novel T-cell
receptors, which can be used in an analogous manner as
antibodies, or constructs can be provided which provide
for extrachromosomal elements or integration into a
host genome, where the hybrid proteins may be expressed
and transported to the membrane.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 is a restriction map of T-cell
antigen receptor fragments, where shaded areas indicate
major homologies between the different cDNA clones.
Sequencing was by the procedure of Maxam and Gilbert,
Proc. Natl. Acad. Sci. USA (1977) 74:560, with thick
arrows representing 3'-end labeling (Klenow) and the
thin arrows 5'-end labeling (polynucleotide kinase).

Figure 2 shows the complete nucleotide
sequence of 86T1 and partial sequences of the other
cDNA clones, indicating the 5'-untranslated region
(UT), the leader polypeptide, variable, joining, and
constant regions, with the numbering following the
amino acid sequence of 86T1 and possible carbohydrate
attachment sites (CHO) (N-X-S or N-X-T) noted.
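
The N-X-S / N-X-T pattern used to mark possible carbohydrate attachment sites can be located in a predicted amino acid sequence with a simple scan. The following is a minimal Python sketch and not part of the original disclosure: the function name and the example peptide are hypothetical, and X is taken to be any residue, as the pattern is written above.

```python
import re

# N-X-S or N-X-T, with X being any residue (as the pattern is written in the text).
# A lookahead is used so that overlapping sites are all reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def find_glycosylation_sites(protein: str) -> list[int]:
    """Return the 1-based position of the N in each N-X-S / N-X-T match."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide used only for illustration.
example = "MANVSQNNTSA"
print(find_glycosylation_sites(example))  # [3, 7, 8]
```

Positions are counted from 1 at the first residue of whatever sequence is supplied, so numbering relative to the leader methionine requires passing the full preprotein sequence.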

Figure 3 shows the sequencing strategy of
TT11 cDNA clones, with thin lines indicating 5'-end
labeling with polynucleotide kinase and thick lines
indicating 3'-end labeling with the Klenow fragment of
DNA polymerase I. It also shows the nucleotide
sequence and predicted amino acids, and indicates
generally the individual regions. "CHO" indicates
potential signal sequences for N-linked glycosylation.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS 

In accordance with the subject invention,
novel DNA sequences are provided involving in-whole or
in-part coding sequences for antigen-specific T-cell
receptors or fragments thereof, specifically involving
functional regions, which may be found on one or more
exons in the germline and rearranged DNA or in-whole or
in-part as cDNA from a mature messenger RNA.

The mammalian T-cell receptors appear to be
80-90kdal heterodimers, which are disulfide linked, and
composed of two distinct glycoproteins of about 40 to
50kd (kilodaltons), referred to as the α- and
β-subunits. The two glycoproteins have variable and
constant regions or domains by peptide map analysis.

The DNA sequences encoding for the
glycoproteins of the heterodimer are divisible into
variable, joining and constant regions, analogous to
immunoglobulins, as evidenced by the sequences having
significant homology with the immunoglobulin sequences
and by the independent assortment of the J-like
elements. Each of the subunits appears to have
diversity (D) regions comparable to the heavy chain of
immunoglobulins.

The α- and β-subunits have many similarities
between themselves, other T-cell membrane proteins and
immunoglobulins or B-cell receptor proteins. For the
most part, the overall homology is low with few
similarities of either amino acid sequence or
nucleotide sequence in the constant regions. (The
methionine of the leader peptide will be used as 1 for
the amino acid sequence.) The cysteine spacing is
found to be between about 65 to 70 amino acids in the
variable region (α-65; β-69; Igλ or κ-65). In addition,
the sequence "WYRQ" in the variable region of the
α-chain at about residue 55 finds analogy in the
β-chain in "WYKQ" and analogous sequences at comparable
positions in immunoglobulins. In the variable regions,
the sequence "DSA-Y-CAV" is found in the region of
residues of about 100-115, with one or two differences
in amino acids.
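
The conserved features named in this paragraph (the "WYRQ"/"WYKQ" motif and a cysteine spacing of about 65 to 70 residues in the variable region) lend themselves to a quick check on a candidate sequence. The sketch below is illustrative Python, not part of the original disclosure. The function names are hypothetical, the test sequence is synthetic, and "spacing" is assumed to mean the difference between the two cysteine positions.

```python
import re


def find_wyxq(protein: str) -> list[int]:
    """1-based start positions of 'WYRQ' or 'WYKQ'."""
    return [m.start() + 1 for m in re.finditer(r"WY[RK]Q", protein)]


def cysteine_pairs(protein: str, lo: int = 65, hi: int = 70) -> list[tuple[int, int]]:
    """Pairs of cysteine positions (1-based) whose spacing lies in [lo, hi]."""
    cys = [i + 1 for i, aa in enumerate(protein) if aa == "C"]
    return [
        (a, b)
        for i, a in enumerate(cys)
        for b in cys[i + 1:]
        if lo <= b - a <= hi
    ]


# Synthetic sequence for illustration only: cysteines at 10 and 75, WYRQ at 20.
seq = "A" * 9 + "C" + "A" * 9 + "WYRQ" + "A" * 51 + "C" + "A" * 5
print(find_wyxq(seq))       # [20]
print(cysteine_pairs(seq))  # [(10, 75)]
```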

The J region appears to be the most highly
conserved with 7 of 16 residues of the α-chain the same
as the β-chain and significant homology with consensus
sequences of murine heavy and light chains.

Also characteristic of the T-cell receptor 
sequences is the sequence "ILLXK" where X is L or G, 
having a basic amino acid in the transmembrane region.

In support of a D or diversity region, 5' to
the "SGN" sequence at about residues 115 to 120, is the
nucleotide sequence "G 5 ". In 7 of 14 β-chain putative
D regions, runs of "G^y" are found on the 3'-side,
which finds analogy in immunoglobulin heavy chain D
regions.

The α- and β-chains are encoded in germline
DNA which is subject to rearrangement to provide a
transcript which may be further processed. Either the
genomic DNA may be used or cDNA from the mature
transcript for purposes described hereinafter.

Each of the chains has from 3 to 5, usually 4
to 5, N-glycosylation sites, where some or all of them
may be employed.

The two chains of the heterodimer are
different and appear to be derived from different gene
loci. The sequences for a β-chain and an α-chain are
set forth in Figures 2 and 3, respectively. The chains
may be divided up into regions associated with specific
exons by analogy to immunoglobulins. The primary
regions are the leader region, variable region (V),
diversity region (D), which may be part of V, the
joining region (J), the constant region (C), the
transmembrane region (TM) and the cytoplasmic region
(Cy).

The α-chain without glycosylation will be
about 25 to 30kD (kilodaltons), while the β-chain will
be about the same or larger, being about 25 to 35kD.
With glycosylation the subunits will be about 35 to
50kD each, usually 40 to 50kD each, providing a
sulfhydryl linked heterodimer of 80 to 90kD.

For each of the subunits, the rearranged DNA
in helper T-cells, including introns, will generally be
approximately 6 to 8kbp, with individual exons being
substantially smaller and approximating the size of
cDNA sequences for domains plus whatever flanking
regions are included.

The DNA sequence coding for the constant
region (including the transmembrane and cytoplasmic
region) will generally be about 400 to 600nt
(nucleotides), plus about 300nt of 3' untranslated
regions. These sequences will be characterized by
having codons encoding for intrachain disulfide
linkages between cysteines spaced apart about 100 to
200nt, usually about 100 to 150nt.

In conjunction with the constant region is a
probable transmembrane sequence, primarily including
hydrophobic amino acids and having from about 45 to
105nt. This sequence will define the 3'-terminus of
the constant region and may include about 5 to 15
codons (15 to 45nt) for amino acids which extend into
the cytoplasm of the cell.

The next region or domain which appears to
have functional significance is the region analogous to
the J region of immunoglobulins. As is known with
immunoglobulins, there are a plurality of J regions
adjacent to a given C region, ranging about 1 to 6 in
number, more usually about 4 to 5. The J regions
encode about 15 to 20 amino acids, the β-chain with 16
amino acids having greater similarity to the
immunoglobulin heavy chain J regions, in that the heavy
chain J regions are typically 17 amino acids, while the
light chain J regions are typically 13 amino acids.

The J regions can be used in conjunction with
constant regions which may or may not include the
transmembrane sequence and cytoplasmic sequence to be
joined to other DNA sequences, e.g., non-wild
sequences, to produce hybrid sequences to allow for
novel hybrid proteins on T-cell surfaces. (By
"non-wild" is intended a sequence other than the
wild-type or native sequence, while "foreign" intends
from a source which does not normally exchange genetic
information with the source of the T-cell antigen
receptor DNA sequence.)

In order to have the hybrid proteins
transported to the surface, the secretory leader
sequence present with the T-cell antigen receptor is
employed as the 5'-terminus. The sequence is of about
15 to 25 amino acids, more usually 18 to 24 amino
acids. Thus constructs can be prepared, where various
domains of the T-cell antigen receptor DNA sequence,
which may include non-coding flanking regions, are
separated by non-wild type DNA.

Novel DNA constructions can be employed for
cloning and/or expressing the T-cell antigen receptor,
individual subunits, fragments thereof, or combinations
of fragments with non-wild DNA, including foreign DNA,
to produce hybrid proteins. The fragments will be of
at least about 15nt, usually at least about 50nt.
These constructions will for the most part have the
following formula:

(RS)a - (M)b - (tis) - (eis) - (T-AgR) - (ets) - (tts)

wherein:

"RS" indicates a replication system which may
be derived from prokaryotes or eukaryotes, plasmids,
phage, or viruses, where one or more replication
systems may be involved which allow for replication in
different hosts, e.g., unicellular microorganisms, and
maintenance as extrachromosomal elements; illustrative
replication systems or vectors include lambda, simian
virus, papilloma virus, adenovirus, yeast 2mμ plasmid,
ColE1, pRK290, pBR322, pUC6, or the like, where the
replication system may be complete or may be only a
partial replication system, interacting with a helper
plasmid or one or more genes present in the genome,
e.g., COS cells; for cloning, a replication system will
be employed, such as a plasmid or viral replication
system recognized by the unicellular host, for example,
bacteria, yeast, etc., particularly E. coli;

"a" is an integer of from 0 to 3, usually 1
to 3, being 0 where integration into the chromosome is
desired, although integration can be achieved with
native or foreign replication systems;

"M" intends a structural gene or cistron,
referred to as a marker, with its transcriptional and
translational regulatory signals which provides means
for selecting host cells which contain the construct;
markers include biocide resistance, such as resistance
to antibiotics, e.g., ampicillin, chloramphenicol,
neomycin, G418, or the like, toxins, heavy metals,
etc.; immunity; complementation providing prototrophy
to an auxotrophic host, or the like;

"b" is an integer of from 0 to 3, more
usually 0 to 2, preferably 1 to 2;

"tis" intends the transcriptional initiation
sequences for regulating transcriptional initiation and
includes one or more promoters, including the native
promoter by itself or in combination with other
promoters, e.g., viral promoters or foreign promoters,
as well as sequences which affect the promoter, such as
operators, activators, enhancers, capping sequence,
TATA and CAAT sequences, or the like, where the
sequences will be organized in the construct so as to
be able to fulfill their function;

"eis" intends the expression initiation
sequences for regulating expression and includes any
ribosomal binding site, the initiation codon as
appropriate, oligonucleotides separating the ribosomal
binding site and initiation codon, where such sequences
affect expression, and the like;

"T-AgR" intends the T-cell antigen receptor
or a hybrid DNA sequence comprising fragments of the
T-cell antigen receptor and hybrid DNA sequences, where
the sequences together provide for an open reading
frame coding for the antigen receptor or hybrid
protein;

"ets" intends expression termination
sequences, which may include one or more stop codons
and such other sequences as may be appropriate; and

"tts" intends transcriptional termination
sequences, which may include the transcriptional
terminator, normally balanced with the transcriptional
promoter and may be one or more terminators in
combination with one or more stop codons,
polyadenylation signal sequence, or the like.
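
The ordering of elements in the construct formula, together with the bounds on the integers "a" and "b", can be captured in a small data structure. The following Python sketch is illustrative only and not part of the original disclosure: the class and method names are hypothetical, and the element labels are simply the abbreviations defined above.

```python
from dataclasses import dataclass


@dataclass
class Construct:
    """(RS)a - (M)b - (tis) - (eis) - (T-AgR) - (ets) - (tts)."""

    a: int  # number of replication systems, 0 to 3
    b: int  # number of markers, 0 to 3

    def __post_init__(self) -> None:
        for name, value in (("a", self.a), ("b", self.b)):
            if not 0 <= value <= 3:
                raise ValueError(f"{name} must be an integer from 0 to 3, got {value}")

    def elements(self) -> list[str]:
        """Element labels in the order in which the formula is written."""
        return ["RS"] * self.a + ["M"] * self.b + ["tis", "eis", "T-AgR", "ets", "tts"]

    @property
    def integrating(self) -> bool:
        """a = 0 is the case described for integration into the chromosome."""
        return self.a == 0


shuttle = Construct(a=2, b=1)  # e.g. two replication systems for two hosts
print(" - ".join(shuttle.elements()))
```

A construct with `a=0` carries no replication system, which corresponds to the case the text associates with chromosomal integration.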

The T-cell antigen receptor subunit or hybrid 
T-cell antigen receptor will for the most part have the 
following formula: 

(S.L.)c - (V-seq) - J - C - (TM)d - (Cy)e

wherein:

"S.L." intends a secretory leader sequence,
which will encode for about 15 to 25 amino acids, more
usually about 17 to 24 amino acids, and preferably
about 19 to 23 amino acids having 45 to 75nt, usually
51 to 72nt, preferably 57 to 69nt;

"c" is 0 or 1;

"V-seq" intends a DNA sequence which encodes
for the variable region of the T-cell antigen receptor
subunit or may be replaced by a sequence encoding for a
different polypeptide, which DNA sequence will be in
reading frame with the secretory leader sequence (S.L.)
as appropriate or may have its own initiation codon in
the absence of the secretory leader sequence; the
variable sequence will generally be at least about 60nt
and not more than about 600nt, more usually not more
than about 400nt; where the sequence codes for a T-cell
receptor variable region, the sequence will generally
range from about 270 to 330nt, more usually from about
285 to 312nt;

"J" intends the joining region and will
generally be from about 42nt to about 60nt, more
usually from about 45nt to 57nt, and frequently about
48 to 54nt, where the J region will be selected from a
limited number of sequences associated with the joining
region exons of the T-cell antigen receptor subunit;

"TM" intends the transmembrane integrator
sequence which will be a hydrophobic sequence of from
about 51 to 90nt, more usually from about 84 to 96nt;

"d" will be 0 or 1;

"Cy" intends the sequence extending from the
membrane into the cytoplasm, which will normally be
from about 12 to 30nt, more usually from about 15 to
24nt, particularly about 15 to 18nt; and

"e" is an integer from 0 to 1.

Each of the two subunits, α- and β-, may be
expressed independently in different hosts or in the
same host. Where the two subunits are expressed in the
same host, whether a microorganism host or mammalian
host is employed will affect the
processing of the subunits and assembling of the
subunits into the T-cell receptor. Involved with
processing is folding, glycosylation, transport through
the endoplasmic reticulum and Golgi apparatus, cleavage
with removal of the secretory leader sequence, as well
as capping or blocking of the N-terminus by
acetylation. As part of the processing or independent
of the processing, folding of the subunits and
assembling of the subunits into the T-cell receptor
must occur. With mammalian cells, it is to be expected
that the resulting protein will substantially conform
with the naturally occurring T-cell receptor in its
chemical, physical and biological properties. However,
with lower eukaryotes and prokaryotes, various of the
process steps may occur in whole or in part,
differently or not at all. Therefore, the sequences
may be modified by replacing the wild type secretory
leader sequence with a secretory leader sequence
recognized by the expression host or by providing an
initiation codon at the beginning of the variable
region. The subunits may then be isolated in the
cytoplasm and the receptor formed by bringing the α-
and β-subunits together under renaturing conditions.

DNA coding for receptor subunit fragments
should encode for a polypeptide of at least 8 amino
acids, usually at least 15 amino acids (24nt and 45nt,
respectively), so as to provide polypeptides having
biological activity, e.g., immunological.
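
The nucleotide counts quoted for fragments (8 amino acids, 24nt; 15 amino acids, 45nt) follow from three nucleotides per codon, and the same arithmetic links the amino acid and nucleotide ranges given earlier for the secretory leader (15 to 25 amino acids, 45 to 75nt). As a minimal Python sketch, not part of the original disclosure, the conversion and a check of region lengths against the broadest ranges quoted in the subunit formula might look as follows. The names are hypothetical, the "about" ranges are treated as hard limits purely for illustration, and the C region is omitted because no range is listed for it in that formula.

```python
# Broadest nucleotide ranges quoted in the subunit formula definitions (nt).
REGION_RANGES_NT = {
    "S.L.": (45, 75),
    "V-seq": (60, 600),
    "J": (42, 60),
    "TM": (51, 90),
    "Cy": (12, 30),
}


def aa_to_nt(n_amino_acids: int) -> int:
    """Coding length in nucleotides: three nucleotides per codon."""
    return 3 * n_amino_acids


def out_of_range(lengths_nt: dict[str, int]) -> list[str]:
    """Names of the supplied regions whose length falls outside the quoted range."""
    bad = []
    for region, length in lengths_nt.items():
        lo, hi = REGION_RANGES_NT[region]
        if not lo <= length <= hi:
            bad.append(region)
    return bad


print(aa_to_nt(8), aa_to_nt(15))                           # 24 45
print(out_of_range({"S.L.": 60, "J": 48, "TM": 120}))      # ['TM']
```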

The constructions, as indicated, can be
prepared by inserting DNA coding for only a portion of
a T-cell antigen receptor subunit where the vector has
one or more appropriate restriction sites, or can be
modified, for example, by adapters, to provide for
insertion at an appropriate site in relation to a
promoter and associated regulatory sequence, e.g., RNA
polymerase binding site for transcription and to
appropriate translational regulatory sequences, e.g.,
ribosomal binding site, or to be in reading frame with
a leader sequence.

The domains or regions of the T-cell antigen
receptor can be employed individually or in
combination. By employing cDNA, one can obtain the
gene in open reading frame coding for the preprotein,
that is the protein prior to processing, such as
removal of the secretory leader, glycosylation, or the
like. By restriction mapping the cDNA, one can
determine the presence of convenient restriction sites
adjacent the borders between the individual domains as
indicated in the above formula. Where a restriction
site is not at the border, one can still cleave at a
site near the border, using either partial or complete
digestion as appropriate with the appropriate
restriction enzyme. Where nucleotides have been
removed so as to have a truncated sequence, one may
replace the nucleotide(s) employing appropriate
adapters which allow for joining the domain of interest
to another nucleotide sequence in proper reading frame.
Where extra nucleotides are present, one can remove
these by resection, e.g., employing Bal 31, by primer
repair, or the like. Alternatively, where there is
degeneracy in the codon for a particular amino acid, in
vitro mutagenesis may be employed to modify one or more
nucleotides which would then provide the proper
recognition sequence for a restriction enzyme. These
techniques have been extensively described in the
literature and do not require exemplification here.
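
The use of codon degeneracy to introduce a restriction site can be illustrated computationally. The Python sketch below is not part of the original disclosure and makes several assumptions: it searches only single-nucleotide substitutions, it uses the standard genetic code, and the coding sequence and choice of enzyme (EcoRI, recognition sequence GAATTC) are hypothetical examples. It reports the substitutions that create the site while leaving the encoded amino acids unchanged.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_seq: str) -> str:
    """Translate an in-frame DNA sequence; a trailing partial codon is ignored."""
    usable = len(coding_seq) - len(coding_seq) % 3
    return "".join(CODON_TABLE[coding_seq[i:i + 3]] for i in range(0, usable, 3))


def silent_site_mutations(coding_seq: str, site: str) -> list[tuple[int, str, str]]:
    """Single-base changes (0-based position, old base, new base) that create
    `site` without altering the encoded amino acid sequence."""
    if site in coding_seq:
        return []  # the site is already present; nothing to create
    protein = translate(coding_seq)
    hits = []
    for pos, old in enumerate(coding_seq):
        for new in BASES:
            if new == old:
                continue
            mutant = coding_seq[:pos] + new + coding_seq[pos + 1:]
            if site in mutant and translate(mutant) == protein:
                hits.append((pos, old, new))
    return hits


# Hypothetical coding sequence (Met-Glu-Phe-stop); GAG -> GAA keeps Glu.
print(silent_site_mutations("ATGGAGTTCTAA", "GAATTC"))  # [(5, 'G', 'A')]
```

Extending the search to two or more simultaneous substitutions would follow the same pattern at a higher computational cost.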

The DNA sequences which are employed may be
the same or different from the sequences isolated in
accordance with the subject invention. By employing
the J or C domain sequences either by themselves or in
combination with other sequences, e.g., transmembrane
sequences, the subject sequences can be used as probes
for determining the presence of homologous sequences in
the same or different species and for isolating
sequences having equivalent functions. In this manner,
a repertory of sequences can be obtained which can be
joined together to provide for a variety of cistrons
coding for T-cell antigen receptors or hybrid proteins
employing varying combinations of fragments from T-cell
antigen receptors.

By joining the secretory leader sequence of
the T-cell antigen receptors to a non-wild DNA
sequence, one can provide for secretion of a hybrid
protein into the nutrient medium and processing, so as
to obtain a mature protein product from a mammalian
host. Where the protein is a eukaryotic protein, it
can be properly processed, so as to provide a product
which is the same or substantially the same as the
naturally-occurring eukaryotic protein. Alternatively,
if one wishes to provide for specific proteins on the
T-cell surface or surface of a different mammalian
cell, one can interpose the foreign sequence coding for
the foreign protein between the secretory leader
sequence and the transmembrane sequence in place of the
sequences coding for the variable, J and constant
regions of the T-cell antigen receptor subunit. In
this way, one can provide for a totally different
surface membrane protein at the cell surface, modifying
the surface characteristics of the cell.

By using the expression products of the
subject constructs, one can obtain antibodies to the
expression products, which can then be used for
detecting the presence of T-cell antigen receptors or
individual subunits, due to sharing idiotypic
determinants or common determinants to the J or C
regions.

The cloned DNA sequences, particularly the
sequences extending from the 5'-end of the C region to
the 3'-end of the cytoplasmic region, can be used as
probes. Usually, the probes will be at least about
15nt, more usually at least about 30nt and will
generally not exceed 1500nt, more usually not exceeding
about 1000nt, preferably not exceeding about 500nt of
homologous sequence. Additional, non-homologous
flanking sequences may be present which may be up to
5knt or more.

The nucleotide sequences employed as probes
may be RNA or DNA and may be labeled in a wide variety
of ways. Commonly, probes are labeled with 32P and may
be detected by autoradiography. Alternatively, biotin,
novel sugars, or any other molecule may be included by
virtue of the use of synthetic techniques for producing
the oligonucleotide. Thus, any terminal group may be
introduced in a simple manner to act as a source of a
detectable signal. These groups may be introduced
directly or indirectly, that is, by covalent bonding,
ligand-receptor bonding, e.g., haptens and antibodies,
or the like. Illustrative labels which provide for a
detectable signal include fluorescers,
chemiluminescers, enzymes, radioactive labels, magnetic
particles, and the like.

Two different methods for obtaining rare
messenger RNA were employed for isolating the rare
messenger RNAs associated with the T-cell antigen
receptor subunits. The method employed for the
β-subunit involved the separation of membrane-bound
polysomal RNA from non-membrane-bound RNA. The
membrane-bound polysomal fraction of RNA was then
reverse transcribed to produce single-stranded (ss)
cDNA. The cDNA was then labeled with 32P and
repeatedly hybridized with B-cell mRNA and fractionated
on hydroxyapatite. Remaining ss cDNA which passed
through the column was isolated. A second T-helper
hybridoma was then used to prepare a cDNA library and
was screened with the cDNA probes prepared from the
first T-helper hybridoma. This resulted in substantial
enrichment for T-cell-specific membrane associated
sequences (about 200-fold).

The reduced number of selected clones was
rescreened using the initially prepared probes. The
positive clones were then nick-translated and
hybridized to B-cell mRNA under Northern blotting
conditions. Those clones that did not hybridize to the
B-cell mRNAs were selected as T-cell-specific.

The clones were then employed to investigate
somatic rearrangements as follows. Those which
hybridize to RNAs having greater than 1000nt were
hybridized to Southern blots of genomic DNA from
various sources including a helper T-cell hybridoma and
a thymoma. The DNAs were prepared by standard methods,
digested with a particular restriction enzyme, in this
case, Pvu II, electrophoresed through 0.9% agarose and
blotted on to nitrocellulose. Moderate to strict
stringency was employed, and both the thymoma and
hybridoma were found to give substantially different
patterns from the other non-T-cell DNA.
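
The reason a somatic rearrangement shows up as a changed Southern blot pattern is that it alters the lengths of the restriction fragments carrying the probed sequence. The Python sketch below, which is not part of the original disclosure, computes fragment lengths for a linear sequence digested with Pvu II (recognition sequence CAGCTG, cut after CAG to leave blunt ends). The function name and both sequences are hypothetical and far shorter than genomic fragments. They only illustrate how a deletion that removes a site changes the fragment pattern.

```python
def digest_lengths(seq: str, site: str = "CAGCTG", cut_offset: int = 3) -> list[int]:
    """Fragment lengths from cutting a linear sequence at every occurrence of `site`.

    The defaults model Pvu II, which recognizes CAGCTG and cuts after CAG.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Synthetic sequences for illustration only.
germline = "A" * 4 + "CAGCTG" + "T" * 5 + "CAGCTG" + "G" * 2
rearranged = "A" * 4 + "CAGCTG" + "G" * 2  # hypothetical deletion removing one site

print(digest_lengths(germline))    # [7, 11, 5]
print(digest_lengths(rearranged))  # [7, 5]
```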

The method employed for the α-subunit
involved a variable-region specific subtracted cDNA
probe between T-cells of differing specificities.
Random-primed labeled cDNA from the mRNA of a helper
hybridoma was synthesized. After fragmentation to an
average size of about 300-400nt, sequences were
subtracted with mRNA from at least two different
T-helper hybridomas or T-helper like lymphoma lines,
using hydroxyapatite to separate single from
double-stranded nucleic acid. The single-stranded cDNA
remaining was then hybridized to a cDNA library,
prepared from the cell line providing the original
cDNA. Further elimination of irrelevant sequences can
be achieved by rescreening the positive clones with
oligo-dT primed cDNA from the same T-helper hybridoma,
where the cDNA is reverse transcribed from membrane
bound polysomal mRNA subtracted with mRNA from a
macrophage or other lymphocytic line. Resulting
hybridizing clones are found to be related to the
variable region of the T-cell receptor.

By employing hybrid DNA technology the α- and
β-subunits can be prepared individually or combined as
a receptor having high specificity and affinity for
specific conformations of organic molecules, such as
polypeptides, polysaccharides, lipids, haptens and
combinations thereof. A class of receptors is provided
analogous to the immunoglobulins which can be used in
substantially the same way, but lacking properties
associated with immunoglobulins, such as Fc
determinants, complement associated cytotoxicity, or
other characteristics specifically associated with
immunoglobulins. The subject receptors can be used to
compete with surface membrane bound T-cell receptors in
vivo in blood to inhibit proliferation of helper cells
activated by the homologous antigen.

The T-cell receptors may be used in most of
the situations where immunoglobulins find use, such as
diagnostic assays, affinity chromatography,
site-directed therapy or diagnosis where the T-cell
receptor may be conjugated directly or indirectly to
radionuclides, nmr active compounds, fluorescers,
toxins, e.g., abrin, ricin, etc., or the like.

By having the genes available for the α- and
β-chains, the chains and, therefore, the receptors may
be prepared in large amounts from cells other than
human cells, which are less fastidious in their growth
requirements than human cells. The T-cell receptors may
be prepared in bacteria, e.g., E. coli, B. subtilis,
etc., eukaryotes, e.g., yeast, filamentous fungus,
murine cells, etc.

The following examples are offered by way of 
illustration and not by way of limitation: 

EXPERIMENTAL 

The procedure for isolation of genes 
encoding the helper T-cell antigen-specific receptor 
subunits α and β (T_H-Ag receptor, α- or β-subunit) is 
as follows. 

The isolation of the β-subunit will be 
considered first. A membrane-bound T-helper cell cDNA 
probe was subtracted with B-cell messenger RNA and 
used to screen a cDNA library which was the product of 
another T_H-B-cell lymphoma combination. The library 
was constructed as described below for a 
B-cell-specific library (Davis et al., Proc. Natl. 
Acad. Sci. USA (in press)) and by a similar procedure 
for a Xenopus embryonal stage-specific library (Sargent 
and Dawid, Science (1983) 222:135-139) using 
T_H-hybridomas M12 or 2B4 (Hedrick et al., Cell (1982) 
30:141-152). The B-cell mRNA was obtained from the 
B-cell lymphomas L10A and Bal17 (Kim et al., J. 
Immunol. (1979) 122:549-554). The cDNA was 32P-labeled 
for detection. The T_H-B library was 20-fold enriched 
for T-cell-specific sequences as judged by the fact 
that 95% of the mass of the cDNA was removed in the 
subtraction at the hydroxyapatite stage. 
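
The 20-fold figure follows from the 95% mass removal if one assumes that none of the removed cDNA was T-cell-specific. A minimal sketch of that arithmetic is given below; the function name is illustrative and not part of the original procedure.

```python
def fold_enrichment(fraction_removed: float) -> float:
    """Fold enrichment of the retained cDNA, assuming the removed
    mass contained no sequences of interest (an idealization)."""
    if not 0.0 <= fraction_removed < 1.0:
        raise ValueError("fraction_removed must be in [0, 1)")
    # If 95% is removed, the target sequences are concentrated
    # into the remaining 5% of the mass.
    return 1.0 / (1.0 - fraction_removed)


if __name__ == "__main__":
    print(round(fold_enrichment(0.95), 1))
```

Any T-cell-specific material lost during subtraction would lower the true enrichment, so the value is best read as an upper estimate.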

(The exemplary procedure for the B-cell 
library follows. The cell lines Bal17, B-cell lymphoma 
(IgM+ IgD+ Ia+) (Kim et al., J. Immunol. (1979) 
122:549-554) and Bal4, T-cell thymoma 
(Thy1+ Lyt1- Lyt2+ TL+) (Kim et al., ibid. (1978) 
121:339-344) were grown in RPMI, glutamine, 70% fetal 
calf serum and 5x10^-5 M β-mercaptoethanol in a 5% CO2 
atmosphere. After growing to a high density 
(1-2x10^6/ml) and being refreshed with new media for 
2 to 4hr, the cells were chilled with PBS and harvested. 
The cells were washed several times in cold PBS, 
resuspended in 0.14M KCl, 0.02M Tris, pH 8.0, 0.0015M 
MgCl2, lysed with the addition of NP-40 to 1% and the 
nuclei pelleted. The cytoplasmic fraction was made 
0.5% SDS, 5mM EDTA and extracted 2-3x with saturated 
phenol, once with Sevag (CHCl3:isoamyl alcohol, 24:1), 
precipitated with ethanol (Mushinski et al., Proc. 
Natl. Acad. Sci. USA (1980) 77:7405-7409) and the 
polyA+ RNA selected on oligo-dT cellulose (1-2 
passages). 

cDNA from the B-cell lymphoma was synthesized 
from 1 to 5µg template polyA+ RNA in 50mM Tris, pH 8.3, 
6mM MgCl2, 70mM KCl, 1mM each dNTP, 32P-dCTP (to give 
first strand specificity of 10^5 cpm/µg), 10µg/ml 
oligo-dT, 20mM dithiothreitol, 100µg/ml Actinomycin "D" 
in a 100µl reaction mixture. Ten units AMV reverse 
transcriptase was added per µg polyA+ RNA and incubated 
for 2hr at 42°C. After adding an equal volume of 0.2M 
NaOH, the mixture was incubated at 70°C for 20min, 
cooled on ice, neutralized with 1M HCl and sodium 
acetate (pH 6.5) and SDS added to 0.2M and 0.1%, 
respectively. At room temperature, the cDNA was 
excluded from G-50F Sephadex in a Pasteur pipette 
column with a running buffer of 100mM NaCl, 50mM Tris, 
pH 7.5, 1mM EDTA and 0.02% SDS. Fifty µg of tRNA was 
added as carrier and the cDNA precipitated in a 
silanized Eppendorf tube (1.5ml). The precipitate was 
washed once with 70% ethanol, dried and resuspended in 
0.5M phosphate buffer, 5mM EDTA, 0.1% SDS and 
hybridized in sealed glass capillaries with the T-cell 
thymoma RNA at a 10-fold excess at 1 to 1.5mg/ml. To 
absorb repetitive sequences, sheared mouse genomic DNA 
(1.2mg/ml, 10µg per reaction) was included. After 
boiling for 60sec, the mixture was incubated for 16 to 
20hr at 60°C. Hydroxyapatite chromatography was then 
used to fractionate the material in 0.12M phosphate 
buffer, 0.1% SDS, 60°C. 

The single-stranded fraction was made 
double-stranded with DNA polymerase I (Klenow 
fragment), trimmed with S1 nuclease and G-C tailed into 
the PstI site of pBR322. The plasmids were then cloned 
into E. coli at high efficiency (50-400x10 ug/insert) 
with an average insert size about 500nt.) 

The library of 5000 selected clones was 
screened and rescreened by standard procedures 
(Maniatis et al. in Molecular Cloning, Cold Spring 
Harbor Press, Cold Spring Harbor, 1982) using the 
membrane-bound T-helper cell cDNA probes from the 
T-hybridoma 2B4 from which sequences common to B-cell 
messenger from the B-cell L10A had been subtracted 
(MBT_2B4 - B_L10A). Thirty-five definite positives 
resulted, which was about 10% of the library. In order 
to determine which were derived from the same gene and 
which were different, as well as to remove any false 
positives, each of these plasmid clones was 
nick-translated and hybridized to representative 
Northern blots. Five were reactive with B-cell mRNA 
into one of the 10 distinct patterns of mRNA size and 
expression shown in the following Table. 

TABLE 

                                          Expression Pattern* 
Clone   Insert Size   Message Size        T-hybridoma   T-lymphoma   B-lymphoma 
TM4     0.7           1.5, 1.9, 4.5 kb    +             +            - 
TM8     0.8           1.9 kb              +             + 
TM26    N.D.          1.0 kb              +             N.D. 
TM28    0.8           1.6 kb              + 
TM29    N.D.          0.6 kb              + 
TM30    N.D.          0.7 kb              +             +            - 
TM33    N.D.          1.7, 1.9, 3.0 kb    +             + 
TM86    0.6           0.7 kb              +             +            - 
TM90    0.25          1.7 kb              +             + 
TM97    0.95          1.8, 1.9 kb         +             + 

* T-hybridoma, 2B4 and C10; T-lymphoma, Bal4 and Bal13; B-lymphoma, 
L10A and Bal17. 

TM8 cross-hybridized strongly with a rat 
thy-1 cDNA clone; thy-1 is a classic T-cell membrane 
antigen. 

A cDNA library was now prepared from the 
hybridoma 3.3T (Heber-Katz et al. , supra ) . 

Each of the seven clones which hybridized to 
messengers of at least 1000nt was labeled and 
hybridized to genomic Southern blots composed of DNA 
from the thymoma BW5147 (from the mouse strain AKR, 
Heber-Katz et al., supra), AKR liver, the 
antigen-specific T-cell 2B4 (a fusion of T-cells from 
B10.A mice with BW5147), and B10.A liver. The DNAs 
were prepared by standard methods (Maniatis et al., 
supra), electrophoresed through 0.9% agarose and 
blotted on to nitrocellulose. The autoradiograms from 
the Southern blots showed that except for the 
restriction polymorphism between AKR and B10.A seen 
with TM8 (thy-1), the patterns of hybridization with 
each clone were identical for all of the sources of DNA 
except in the case of TM86. There was a strikingly 
different pattern of PvuII fragments that hybridized to 
the clone from either BW5147 or 2B4, as compared to 
liver DNA from either of the parental strains. The 
clones which were surveyed were also hybridized to 
EcoRI and HindIII digests of genomic DNA and in each 
case only TM86 showed a significant difference between 
the T-cell DNAs and liver DNAs. The TM86 clone is PstI 
excisable from pBR322. 

To test whether genomic rearrangements of a 
receptor gene were unique for T-cells of different 
antigen specificities, genomic blots consisting of DNA 
from five antigen-specific T-cell hybridomas were 
hybridized with a nick-translated insert from clone 
TM86. The results were that DNA from each of the 
antigen-specific T-cells yielded a unique pattern. 
Three different B-cell lymphoma tumor DNAs gave 
patterns identical to that of the liver indicating that 
the rearrangement appeared to be unique to the T-cells. 
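
The comparison underlying this conclusion can be sketched as a check for restriction fragments that are gained or lost relative to germline (liver) DNA. The band sizes below are invented for illustration and are not data from the blots; the tolerance parameter is likewise an assumption standing in for gel resolution.

```python
def rearranged_bands(germline_kb, sample_kb, tolerance_kb=0.1):
    """Return (new, lost) bands of a sample relative to germline DNA.

    Two bands are treated as the same fragment when their sizes
    differ by no more than tolerance_kb (assumed gel resolution).
    """
    new = [b for b in sample_kb
           if all(abs(b - g) > tolerance_kb for g in germline_kb)]
    lost = [g for g in germline_kb
            if all(abs(g - b) > tolerance_kb for b in sample_kb)]
    return new, lost


if __name__ == "__main__":
    liver = [9.4, 4.2]                 # hypothetical germline pattern
    t_hybridoma = [9.4, 6.1, 2.3]      # hypothetical T-cell pattern
    b_lymphoma = [9.4, 4.2]            # hypothetical B-cell pattern
    print(rearranged_bands(liver, t_hybridoma))
    print(rearranged_bands(liver, b_lymphoma))
```

A sample whose pattern matches germline returns two empty lists, which corresponds to the B-cell lymphoma result described above.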

Also a series of cytotoxic lines express 
messenger RNAs similar to those of T helper cells (by 
cross-reaction with the gene described here) and also 
display rearrangement of their genomic DNA. 

In order to obtain other cDNA clones which 
arose independently in different T-lymphocytes, a 
thymocyte cDNA library was prepared using the lambda 
vector gt10 (generally available from Ronald Davis, 
Stanford University, Stanford, California). The 
library was screened with the TM86 clone using standard 
conditions (Maniatis et al. in Molecular Cloning (Cold 
Spring Harbor Press) Cold Spring Harbor, New York 
(1982)). The library was constructed from total 
thymocyte polyA+ RNA from young Balb/c strain mice. 
The cDNA was prepared using AMV reverse transcriptase 
(Amersham). The cDNA was not methylated, accounting 
for cleavage within the mRNA sequence on the 3'-side. 
After filling in with DNA polymerase I, EcoRI linkers 
were joined to each end. The resulting fragments were 
then fractionated in the desired size range, and 
inserted into the single EcoRI restriction site located 
in the phage repressor gene of the lambda vector gt10. 
Introduction of a DNA fragment into the repressor gene 
produces cI- phage, which forms a clear plaque. The 
cI+ phage forms a turbid plaque, allowing for selection 
of hybrid phage. To eliminate the parent phage from 
the gt10 libraries, the bacterial host utilized was 
C600 rk-mk+hfl, on which the parent phage forms plaques 
at very low efficiency. The cI+ parent phage is 
suppressed, while the cI- hybrid phage plates normally. 

Positively hybridizing recombinants were 
subcloned into the EcoRI site of pUC9 (Vieira and 
Messing, Gene (1982) 19:259-268). Three thymus-derived 
clones were obtained designated 86T1, 86T3 and 86T5. A 
partial restriction map is shown in Figure 1. The 86T 
series molecules all end at the same 3' position 
because of an internal EcoRI recognition site proximal 
to the 3' end of the coding sequence. The 5' end 
variation is presumably due to random chain termination 
during library construction. 

Based on the fact that the largest mRNA seen 
in Northern blots is 1300nt, subtracting a polyA tail 
of 150-250 nucleotides gives an expected clone size of 
1050-1150nt. Therefore, it may be concluded that the 
938 nucleotide size of 86T1 should contain most of the 
coding region sequence for a thymocyte molecule. 86T1 
was completely sequenced and compared with the partial 
sequences of the other clones as shown in Figure 2. 
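
The size estimate above is simple subtraction, and the fraction of the expected transcript covered by 86T1 follows directly from it. The sketch below reproduces the arithmetic using only the figures quoted in the text; the function name is illustrative.

```python
def expected_clone_range(mrna_nt, tail_min_nt, tail_max_nt):
    """Expected non-polyA length range (low, high) for a message."""
    return mrna_nt - tail_max_nt, mrna_nt - tail_min_nt


if __name__ == "__main__":
    low, high = expected_clone_range(1300, 150, 250)
    clone_nt = 938  # size of 86T1 as stated in the text
    print(low, high)
    # Coverage of the expected transcript body by the clone
    print(round(clone_nt / high, 3), round(clone_nt / low, 3))
```

On these numbers 86T1 spans roughly 82 to 89 percent of the expected non-polyA message, consistent with the statement that it should contain most of the coding region.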
Based on the comparisons a number of 
conclusions may be derived: (1) No two 5' ends are 
alike among the four cDNA clones, yet all have 
identical 3' ends, except as noted in Figure 1. The 
entire constant region of 86T3 was sequenced and was 
found to be identical except for one nucleotide with 
that shown for 86T1. This 5' variable and 3' constant 
region structure is analogous to immunoglobulin cDNA 
clones. (2) There is a stretch of hydrophobic amino 
acids immediately following the methionine initiation 
codon corresponding to the expected leader polypeptide. 
In particular, the sequence Leu-Leu-Leu is common among 
kappa light chain leader polypeptides. (3) A 16 amino 
acid element between the variable and constant regions 
is shared at the nucleotide level between 86T1 and 
86T5, but not with 86T3 or TM86, suggesting an 
independently assorting J-like region. (4) Placement 
of cysteine and other residues suggests significant 
structural similarity to immunoglobulins and related 
molecules. (5) The apparent variable region of 86T3 
appears non-functional because of the many (five) stop 
codons in frame with the otherwise normal constant 
region and, in fact, this clone has stop codons in 
reading frame, indicating that not all transcripts of 
this gene are successful in producing a viable 
molecule, at least in the thymus. 
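
Conclusion (5) rests on counting stop codons that fall in the reading frame set by the constant region. A minimal sketch of such a check follows; the toy sequence is invented for illustration and is not taken from 86T3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(dna, frame=0):
    """Return 0-based start positions of stop codons in one frame.

    frame is 0, 1 or 2 and gives the offset of the first codon.
    """
    dna = dna.upper()
    return [i for i in range(frame, len(dna) - 2, 3)
            if dna[i:i + 3] in STOP_CODONS]


if __name__ == "__main__":
    toy = "ATGCTGTAACTGTGACTG"  # hypothetical sequence
    print(in_frame_stops(toy, frame=0))
    print(in_frame_stops(toy, frame=1))
```

In practice the frame would be fixed by aligning the clone to the open constant-region frame of 86T1 before counting.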

In order to analyze the sequence of this cDNA 
clone for evolutionary relationships with known 
proteins, the derived amino acid sequence was compared 
to the Dayhoff protein sequence data bank using the 
rapid comparison programs of Wilbur and Lipman, Proc. 
Natl. Acad. Sci. USA (1983) 80:726-730. From the 
Dayhoff bank of approximately 2300 sequences, the 
homologies of 25 sequences were greater than or equal 
to five standard deviations from the mean homology of 
the data bank. Of these 25 sequences, 24 were 
immunoglobulin constant or variable region sequences 
and one was a Class II human histocompatibility 
molecule. Furthermore, matches in the variable portion 
of 86T1 are with variable regions of immunoglobulins 
while those in the constant portion are with constant 
regions of immunoglobulin. Of 18 invariant residues in 
mouse kappa variable regions (Kabat et al., supra), 13 
are present in the sequence of 86T1 and of the ten 
invariant residues of the heavy chain variable regions, 
six are present in 86T1. The spacing between the 
cysteine residues which form the disulfide loops of 
immunoglobulin variable regions is typically 65 amino 
acids for both kappa and lambda light chains and 70 
amino acids for those of the heavy chains. The 
distance between the outermost two cysteines of the 
86T1 variable region is an intermediate 68 amino acids. 
The alignment of the different immunoglobulin V regions 
predicts that the leader peptide of 86T1 will be 
cleaved just before the asparagine at position 20. 
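
The selection criterion used in the data bank search, a homology score at least five standard deviations above the data bank mean, can be sketched as follows. This illustrates only the thresholding step; it does not reimplement the Wilbur and Lipman comparison, and the scores are synthetic placeholders.

```python
from statistics import mean, pstdev


def significant_hits(scores, threshold_sd=5.0):
    """Names whose score lies >= threshold_sd standard deviations
    above the mean of all scores in the data bank."""
    values = list(scores.values())
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return []
    return sorted(name for name, s in scores.items()
                  if (s - mu) / sd >= threshold_sd)


if __name__ == "__main__":
    # Synthetic background of 100 entries plus one strong match
    scores = {f"bg{i}": (9.0 if i % 2 else 11.0) for i in range(100)}
    scores["hit"] = 40.0
    print(significant_hits(scores))
```

With a bank of about 2300 entries, a five standard deviation cutoff is stringent, which is why the recovery of 24 immunoglobulin sequences among the 25 hits is informative.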
Pronounced homology was observed to 
immunoglobulins throughout what is probably the first 
constant region domain, particularly around the 
cysteine at position 164. In this region it was 
interesting to note that the sequence immediately 5' to 
the cysteine is homologous to light chains and the 
sequence 3' to heavy chains. Substantial homology is 
also observed to both kappa and lambda light chains 
around the last cysteine (position 260) in the 86T1 
sequence. 

Of the four clones, 16 amino acids are shared 
between 86T1 and 86T5, but not with the other two 
clones sequenced, 86T3 and TM86. This homology falls 
precisely into the region occupied by joining (J) 
region elements in immunoglobulins. The putative J 
regions of both 86T1 and TM86 show substantial 
homologies with all the immunoglobulin J regions. In 
terms of size, the putative J regions are more related 
to heavy chains (which average 17 amino acids) than 
light chains (13 amino acids). 

WO 85/03947    PCT/US85/00367 

In addition to the J element, the adjacent 5' 
region between amino acids 103-115 has substantial 
homology between 86T5 and TM86. In particular the 17 
nucleotide and nine nucleotide identities between these 
two cDNA clones suggest other possible "mini-gene" 
elements possibly analogous to the D region of heavy 
chain immunoglobulins. Alternatively, these homologies 
may represent some highly conserved areas of related 
variable region genes. 

A hydropathicity plot (Kyte and Doolittle, J. 
Mol. Biol. (1982) 157:105-132) was performed and 
indicated that: the 86T1 molecule has the alternating 
hydrophobic-hydrophilic stretches characteristic of 
globular proteins; the predicted leader polypeptide 
occurs in a hydrophobic environment; a transmembrane 
spanning region is indicated at the end of the 86T1 
sequence, followed by a string of positive charges 
(lys-arg-lys), characteristic of the cytoplasmic 
portion of a number of lymphocyte cell-surface markers. 
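
A hydropathicity plot of the kind cited is a sliding-window average of per-residue hydropathy values. The sketch below uses the published Kyte and Doolittle scale; the window length of 7 and the toy peptide are assumptions chosen for illustration, not parameters or sequence taken from the analysis of 86T1.

```python
# Kyte-Doolittle hydropathy values, one-letter amino acid codes
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Mean hydropathy over each window of consecutive residues."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and len(protein)")
    values = [KD[aa] for aa in protein]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


if __name__ == "__main__":
    # Hypothetical peptide: hydrophobic stretch, then Lys-Arg-Lys
    toy = "NDSGK" + "L" * 16 + "KRK"
    profile = hydropathy_profile(toy, window=7)
    print(len(profile), round(max(profile), 2), round(profile[-1], 2))
```

A sustained run of high window averages suggests a membrane-spanning segment, and a sharp drop at a following cluster of basic residues matches the pattern described above for the end of the sequence.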

In conclusion, the structure of 86T1 is that 
of a 19 amino acid leader polypeptide, a 98 amino acid 
variable region, a 16 amino acid J region and a single 
globular constant region domain followed by 
transmembrane and cytoplasmic portions. By analogy to 
immunoglobulins, the two outermost cysteines in each 
globular domain would be linked and the last cysteine 
at position 260 would be bound to the other chain of 
the receptor heterodimer. 

It was further found that antisera raised 
against synthetic peptide fragments of 86T1 can 
significantly inhibit the antigen-dependent release of 
IL-2 by T-helper hybridomas. It is therefore concluded 
that the locus described above represents a type of 
immunoglobulin gene specifically rearranged and 
expressed in at least some subsets of T-lymphocytes and 
that it plays a role in the recognition of antigen by 
T-cells. 

The isolation and characterization of the 
T_H-Ag receptor α-subunit will now be described. 
Procedures which have been described previously in 
isolating and characterizing the β-subunit will not be 
repeated. 

Calf thymus DNA was used to synthesize random 
primed 32P-labeled cDNA by standard procedures (Maniatis 
et al., supra) from polyA+ cytoplasmic 2B4 mRNA. The 
cDNA was initially 700nt average length and was allowed 
to fragment by autoradiolysis to about 300-400nt in 
length over a two week period. Subtractive 
hybridization was then carried out employing 
hybridization and hydroxyapatite selection with T_H 
hybridoma C10 mRNA, followed by the mRNA from the 
T_H-like lymphoma cell line EL-4. The twice subtracted 
probe was then hybridized to a filter of 2B4 cDNA 
library in the vector λgt10. Approximately 20,000 
plaques were screened. Seven positives were picked and 
rescreened with a probe of oligo-dT-primed cDNA made of 
membrane bound polysomal mRNA from 2B4, subtracted with 
mRNA from the P388D1 macrophage line (MBT_H-Mac). 
Three of the seven positives were positive with the 
MBT_H-Mac probe and two of these three cross-hybridized 
with each other. One of the cross-hybridizing probes 
was designated TT11 and chosen for further study. The 
TT11 cDNA clone was labeled by nick-translation and 
hybridized to a Northern blot containing a panel of 
mRNAs as follows: a) Bal17, B-cell lymphoma; b) M104E 
plasmacytoma; c) 3T3, fibroblast line; d) P388D1, 
macrophage line; e) 2B4; f) EL-4; g) BW5147. All are 
polyA+ cytoplasmic RNAs prepared by standard 
procedures. A single band was observed at about 1.8kb 
for 2B4, while two bands were observed for EL-4, a 
weaker band at 1.8kb and a second band at 1.3kb in the 
EL-4 lane. 
To demonstrate that the gene encoding for the 
sequence of TT11 was as a result of a rearrangement, 
genomic DNAs from the livers of different mouse 
strains, various T-cell lines and hybrids of the B-cell 
lymphoma L10A were digested with a) HindIII; b) EcoRV; 
c) XbaI; d) BglII and electrophoresed through 0.7% 
agarose, blotted on nitrocellulose and hybridized by 
standard methods to a probe from the 5' half of TT11 
(EcoRI-EcoRV, Fig. 3). Two of the lanes designated FN1 
and FN13 were from KLH reactive T_H hybridomas derived 
from BALB/c x C57B/6 strain mice and the AKR strain 
thymoma line BW5147. A new band appears in a HindIII 
digest of FN1 with respect to parental DNAs, while one 
band in an EcoRV digest of AKR liver DNA disappears in 
BW5147 and an EcoRI digest shows a new band appearing 
in BW5147 versus the parental liver DNA. Two bands are 
observed which are polymorphic for an XbaI digest of 
C57B/6 DNA. Both of these bands are present in the FN1 
hybrid, but only one occurs in FN13, which could only 
be the result of a rearrangement or partial deletion of 
the chromosome. A new band in FN1 versus the parentals 
is observed in a BglII digest. Although no one digest 
shows evidence of rearrangement in all T-cell DNAs, 
there are enough indications of such events to believe 
that TT11 is a T-cell receptor-like gene. 

The TT11 cDNA clone was partially sequenced 
by the procedure of Maxam and Gilbert, Meth. Enzym. 
(1980) 65:499-560, utilizing the strategies shown in 
Fig. 3 and the sequences shown in Fig. 3. The clone 
was oriented by a polyA stretch (about 150nt) at the 
3'-end and sequencing of the 5' half revealed a long 
open reading frame of 810nt, with an initiation codon 
(ATG) within the first 12nt. This sequence has regions 
similar to the Ig leader, variable region, joining (J) 
region, and constant region, just as do the T-cell 
receptor β-chain and the HDS4 gene clone of Saito et 
al., Nature (1984) 309:757-762. 
Features of the sequence are an extra 
cysteine outside of the putative intra-domain cysteines 
in the constant region (common to the latter two T-cell 
specific genes), followed by a transmembrane region and 
a cytoplasmic region, all of which are encoded as 
separate exons in the β-chain genes. There are four 
potential N-linked glycosylation sites, similar to the 
four or five found in different β-chain sequences. It 
would appear that only three of these four potential 
sites are available for glycosylation since the most 
carboxy terminal one is embedded in the transmembrane 
region. 
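
Potential N-linked glycosylation sites are conventionally identified by the sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans for that motif and then excludes sites lying inside a stated transmembrane span, mirroring the reasoning above. The peptide and the span coordinates are hypothetical and are not taken from Fig. 3.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead
# lets overlapping sequons (e.g. "NNTS") both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def n_glyc_sites(protein):
    """1-based positions of Asn residues in N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def accessible_sites(sites, tm_start, tm_end):
    """Drop sites that fall within the transmembrane span (1-based,
    inclusive), which are assumed unavailable for glycosylation."""
    return [s for s in sites if not tm_start <= s <= tm_end]


if __name__ == "__main__":
    toy = "MNGSANPTLLNVTK"          # hypothetical peptide
    sites = n_glyc_sites(toy)
    print(sites)
    print(accessible_sites(sites, tm_start=10, tm_end=14))
```

The presence of a sequon indicates only a potential site; whether it is actually glycosylated depends on context that a motif scan cannot capture.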

The exact position of processing of the 
T-cell receptor α-subunit has not been established, but 
by analogy to immunoglobulin types, it would be just 
before the glutamine at +1 shown in Fig. 3. 
Alternatively, based on the N-terminal amino acid 
sequence of the human β-chain from the REX T-cell line 
(Acuto et al., Proc. Natl. Acad. Sci. USA (1984) 
81:3851-3855) the processing point would be just before 
the asparagine at +3. In the first place, the 
molecular weight would be 27.8kd, which agrees with the 
molecular weight observed by Allison et al., who 
obtained a molecular weight of 27kd from a murine 
α-chain stripped of N-linked sugars with endo F. 

The overall homology with the V and C regions 
of Ig counterparts is relatively low (10-26%). 
However, many of the conserved residues found in all 
the five known Ig-like genes are present, particularly 
in the V region and the J region elements. The spacing 
of the cysteines in the TT11 variable region is 65 
amino acids, which is identical to that of the light 
chains, and the sequence "WYRQ" starting at residue 35 
and "DSA-Y-CAV" at residues 83-91 are also highly 
conserved in most Ig and T-cell receptor V regions. 
As with the β-chain and HDS4, the J region is the most 
highly conserved portion, with 7/16 residues homologous 
to the β-chain consensus sequence (Gascoigne et al., 
Nature (1984) 310:387-391) and 9/16 the same as J_T3 
(Gascoigne et al., supra). TT11 has the sequence 
"ILLLK" in the transmembrane region, characteristic of 
T-cell receptor sequences, where the conservation of a 
charged amino acid (lysine or arginine) in a 
transmembrane region is unusual and not found in other 
members of the immunoglobulin superfamily. 
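
The J region comparisons quoted above (7/16 and 9/16) are position-by-position identity counts over aligned segments of equal length. A minimal sketch of that count follows; the two 16-residue strings are placeholders constructed to give 7 matches and are not the actual J region sequences.

```python
def identity(seq_a, seq_b):
    """Return (matches, length) for two pre-aligned, equal-length
    sequences compared position by position."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return matches, len(seq_a)


if __name__ == "__main__":
    # Placeholder 16-mers, identical at the first 7 positions only
    a = "AAAAAAAAAAAAAAAA"
    b = "AAAAAAACCCCCCCCC"
    matches, length = identity(a, b)
    print(f"{matches}/{length} = {matches / length:.1%}")
```

Expressed as fractions, 7/16 is about 44 percent and 9/16 about 56 percent, both well above the 10-26 percent overall homology reported for the V and C regions.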

While not established, there appears to be 
strong support that there is a D region present in the 
α-subunit. In particular, just 5' to the "SGN" amino 
acid sequence of the J region (which marks the 5' 
border of J_T3 in the β gene complex), there is a 
nucleotide sequence "GGGGG". This is characteristic of 
gene D regions of the β-subunit, where 7 out of 14 
contain runs of between 3-7 Gs on their 3' side 
(Tonegawa, Nature (1983) 302:575-581). The Northern 
blot data described previously further support the 
presence of a D region. The two bands clearly visible 
in the EL-4 lane, and observed in other T_H lines, are 
characteristic of the DJC transcript of the β-chain 
which is 300nt shorter than the VDJC transcript 
(Kavaler et al., Nature (1984) 310:421-423). 
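
The G-run criterion mentioned above can be checked mechanically by locating maximal runs of G and keeping those of length 3 to 7. The sketch below does this; the example sequence is invented and is not the TT11 sequence.

```python
import re


def g_runs(dna, min_len=3, max_len=7):
    """Return (start, length) for each maximal run of G whose
    length lies within [min_len, max_len]; start is 0-based."""
    return [(m.start(), len(m.group()))
            for m in re.finditer(r"G+", dna.upper())
            if min_len <= len(m.group()) <= max_len]


if __name__ == "__main__":
    toy = "ACGGGGGTCAGGTTGGGA"  # hypothetical sequence
    print(g_runs(toy))
```

Matching maximal runs first and filtering by length afterwards avoids reporting a 3 to 7 nucleotide window that is merely part of a longer run.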

To establish the ratio of the mRNAs for the 
α- and β-subunits, thymocyte, conA (concanavalin A) 
stimulated spleen and 2B4 cDNA libraries were surveyed 
with TT11, HDS4 and C_Tβ probes. Whereas TT11 and C_Tβ 
are present in fairly similar frequencies, 1:1-1:3 in 
the 2B4 and conA spleen libraries, respectively, HDS4 
is much rarer. A substantial change in ratio of TT11 
to β-chain in immature versus mature T-cells was 
observed suggesting that TT11 gene expression may come 
after expression of β-chain, analogous to light chain 
immunoglobulin expression following that of the heavy 
chain in B-cells. 

It is evident from the above results that 
novel DNA sequences and constructs are provided which 
provide for the expression of T-cell antigen receptors, 
subunits and fragments thereof. The DNA sequences can 
be used in a variety of ways to produce hybrid 
proteins, which may be retained as surface membrane 
proteins, can be labeled to provide for probes for 
determining lymphocyte origin or type, for isolating 
DNA sequences from T-cells, for use as primers for 
producing DNA sequences coding for the T-cell receptor 
subunits, or for use for secretion of foreign proteins 
from a mammalian host cell. The peptides can be used 
for the production of antibodies for isolation of 
T-cell antigen receptors, for removal of T-cells from 
cell mixtures, for identification of T-cells, or for 
binding to T-cells in vivo or in vitro, so as to affect 
their viability, proliferation, secretion of factors, 
or the like. 

Although the foregoing invention has been 
described in some detail by way of illustration and 
example for purposes of clarity of understanding, it 
will be obvious that certain changes and modifications 
may be practiced within the scope of the appended 
claims. 
WHAT IS CLAIMED IS: 

1. A cDNA sequence of from about 50nt to 
2knt coding for at least one fragment of a T-cell 
antigen receptor. 

2. A DNA sequence of less than about 20knt 
coding for a T-cell antigen receptor or fragment 
thereof. 

3. A DNA sequence according to Claim 2, 
wherein said coding is for the α-subunit. 

4. A DNA sequence according to Claim 2, 
wherein said coding is for the β-subunit. 

5. A DNA sequence according to Claim 2, 
joined to non-wild type DNA to form hybrid DNA 
including a replication system recognized by a 
microorganism. 

6. A DNA sequence according to Claim 1, 
labeled to provide a detectable signal. 

7. A DNA sequence of at least 15nt and 
coding for at least a portion of the constant region of 
a T-cell antigen receptor subunit. 

8. A DNA sequence according to Claim 6, 
wherein said subunit is the α-subunit. 

9. A DNA sequence according to Claim 6, 
wherein said subunit is the β-subunit. 

10. A DNA sequence according to Claim 7 
bound to non-wild type DNA. 
11. A DNA sequence according to Claim 10, 
wherein said constant region coding DNA and said 
non-wild type DNA are separated by a sequence coding 
for a J region. 

12. A cDNA sequence of from about 15 to 
2000nt coding for at least one specific domain of a 
T-cell antigen receptor subunit. 

13. A cDNA sequence according to Claim 12 
joined to transcriptional and translational regulatory 
signals for expression. 

14. A cloning vector for cloning in a 
prokaryotic host including a cDNA sequence according to 
Claim 12 and a marker for selection. 

15. An expression vector including 
transcriptional and translational signals and a cDNA 
sequence according to Claim 12 under the 
transcriptional and translational control of said 
signals. 

16. A DNA sequence of at least about 15nt 
present in the sequence of 86T1 in Figure 2 and joined 
to non-wild type DNA. 

17. A DNA sequence of at least about 15nt 
present in the sequence of TT11 in Figure 3 and joined 
to non-wild type DNA. 
18. A method for obtaining cDNA encoding for 
at least a portion of a T-cell antigen receptor subunit 
which comprises: 

isolating membrane bound polyA+ RNA from a 
T-cell; 

preparing cDNA from said membrane bound RNA; 

hybridizing said cDNA with messenger RNA from 
a B-cell; 

separating single-stranded cDNA from hybrid 
duplex cDNA/RNA; 

cloning said single-stranded cDNA to obtain 
T-cell specific cDNA; 

hybridizing said cloned T-cell specific cDNA 
with restriction fragments of genomic T-cell DNA and 
genomic non-T-cell DNA; and 

isolating cloned cDNA hybridizing to T-cell 
fragments but not non-T-cell fragments. 

19. A method for obtaining cDNA encoding for 
at least a portion of a T-cell antigen receptor subunit 
which comprises: 

preparing random primed cDNA from polyA+ 
cytoplasmic mRNA from a first T-helper cell; 

fragmenting the cDNA to an average size of 
about 300-400nt; 

subtraction hybridizing with mRNA from at 
least two T-helper or T-helper like cells to provide a 
probe mixture enriched for T-cell receptor variable 
region DNA; and 

hybridizing said probe mixture with a cDNA 
library made from said first T-helper cell, where 
hybridizing clones are selected as putative T-helper 
cell receptor subunit genes. 